A robust BFCC feature extraction for ASR system

Authors

Ta-Wen Kuan

An-Chao Tsai

Po-Hsun Sung

Jhing-Fa Wang

Hsien-Shun Kuo

Abstract

An auditory-based feature extraction algorithm, the Basilar-membrane Frequency-band Cepstral Coefficient (BFCC), is proposed to increase the robustness of automatic speech recognition. Whereas the Mel-Frequency Cepstral Coefficient (MFCC) method is based on a Fourier spectrogram, the proposed BFCC method uses an auditory spectrogram based on a gammachirp wavelet transform, which simulates the auditory response of the human inner ear, to improve noise immunity. A Hidden Markov Model (HMM) recognizer is used to evaluate BFCC in training and testing on the AURORA-2 corpus, with datasets at different Signal-to-Noise Ratios (SNRs). The experimental results indicate that, compared with MFCC, Gammatone Wavelet Cepstral Coefficient (GWCC), and Gammatone Frequency Cepstral Coefficient (GFCC), the proposed BFCC improves the speech recognition rate by 13%, 17%, and 0.5%, respectively, on average over speech samples with SNRs ranging from -5 to 20 dB.
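As a rough illustration of the pipeline the abstract describes (gammachirp filtering, per-band energies, cepstral coefficients), the sketch below shows one possible reading. It is not the authors' implementation. The gammachirp parameter values (`n=4`, `b=1.019`, `c=-2.0`), the ERB-spaced center frequencies, the 25 ms / 10 ms framing, the log compression, the DCT step, and all function names are assumptions chosen for illustration; the abstract specifies none of them. The sketch also substitutes a plain gammachirp filterbank applied by convolution for a true wavelet transform.

```python
import numpy as np

def erb(fc):
    # Equivalent rectangular bandwidth in Hz (Glasberg-Moore form).
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def gammachirp(fc, sr, n=4, b=1.019, c=-2.0, dur=0.025):
    # Impulse response t^(n-1) * exp(-2*pi*b*ERB(fc)*t) * cos(2*pi*fc*t + c*ln t).
    # Time starts at 1/sr so that ln(t) is finite. Parameter values are assumed.
    t = np.arange(1, int(dur * sr) + 1) / sr
    envelope = t ** (n - 1) * np.exp(-2.0 * np.pi * b * erb(fc) * t)
    g = envelope * np.cos(2.0 * np.pi * fc * t + c * np.log(t))
    return g / np.sqrt(np.sum(g ** 2))  # normalize to unit energy

def erb_space(fmin, fmax, num):
    # Center frequencies equally spaced on the ERB-rate scale (assumed layout).
    lo = 21.4 * np.log10(4.37 * fmin / 1000.0 + 1.0)
    hi = 21.4 * np.log10(4.37 * fmax / 1000.0 + 1.0)
    e = np.linspace(lo, hi, num)
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def cepstral_features(signal, sr, num_bands=32, num_ceps=13,
                      frame=0.025, hop=0.010):
    # Returns an array of shape (num_frames, num_ceps).
    centers = erb_space(100.0, 0.45 * sr, num_bands)
    flen, fhop = int(frame * sr), int(hop * sr)
    num_frames = 1 + (len(signal) - flen) // fhop
    energies = np.empty((num_frames, num_bands))
    for k, fc in enumerate(centers):
        # Filter the whole signal with one gammachirp band.
        y = np.convolve(signal, gammachirp(fc, sr), mode="same")
        for m in range(num_frames):
            seg = y[m * fhop : m * fhop + flen]
            energies[m, k] = np.mean(seg ** 2)
    log_e = np.log(energies + 1e-10)  # small floor avoids log(0)
    # DCT-II across bands decorrelates the log energies into cepstra.
    bands = np.arange(num_bands)
    basis = np.cos(np.pi * np.outer(np.arange(num_ceps), bands + 0.5) / num_bands)
    return log_e @ basis.T
```

Calling `cepstral_features(x, 8000)` on a one-second signal `x` returns one 13-dimensional vector per 10 ms hop. The 8 kHz sample rate is likewise an assumption made for the example.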


Similar resources

Speech Recognition in Noisy Environment Using Different Feature Extraction Techniques

In this paper, different feature extraction methods for speech recognition systems, such as Mel-frequency cepstral coefficients (MFCC), linear predictive coefficient cepstrum (LPCC), and Bark frequency cepstral coefficients (BFCC), are implemented and compared on the basis of average recognition accuracy. We suggest a noise robust isolated word speech recognition system which can be applied i...


Why do ASR Systems Despite Neural Nets Still Depend on Robust Features

To which extent can neural nets learn traditional signal processing stages of current robust ASR front-ends? Will neural nets replace the classical, often auditory-inspired feature extraction in the near future? To answer these questions, a DNN-based ASR system was trained and tested on the Aurora4 robust ASR task using various (intermediate) processing stages. Additionally, the training set wa...


Missing Feature Imputation of Log-spectral Data for Noise Robust ASR

In this paper, we present a missing feature (MF) imputation algorithm for log-spectral data with applications to noise robust ASR. Drawing from previous work [1], we adapt the previously proposed spectrographic reconstruction solution to the liftered log-spectral domain by introducing log-spectral flooring (LS-FLR). LS-FLR is shown to be an efficient and effective noise robust feature extractio...


Pitch-Synchronous Peak-Amplitude (PS-PA)-Based Feature Extraction Method for Noise-Robust ASR

A novel pitch-synchronous auditory-based feature extraction method for robust automatic speech recognition (ASR) is proposed. A pitch-synchronous zero-crossing peak-amplitude (PS-ZCPA)-based feature extraction method was proposed previously and it showed improved performances except when modulation enhancement was integrated with Wiener filter (WF)-based noise reduction and auditory masking. Ho...


Speech Representation Learning Using Unsupervised Data-Driven Modulation Filtering for Robust ASR

The performance of an automatic speech recognition (ASR) system degrades severely in noisy and reverberant environments in part due to the lack of robustness in the underlying representations used in the ASR system. On the other hand, the auditory processing studies have shown the importance of modulation filtered spectrogram representations in robust human speech recognition. Inspired by these...



Journal:

Artif. Intell. Research

Volume 5

Publication date: 2016
